Signal to Noise Ratio Determination
Tags: pre-process, quality, noise, snr
SNR, which stands for Signal to Noise Ratio, is an estimate of how much noise there is in a digital signal. All digital signals can be divided into two main elements: the relevant data that we meant to record and the irrelevant noise that was unavoidably recorded along with it. The higher the SNR value, the less noise there is in the signal, so an ideal signal would have a very high SNR. However, this is not always easy to achieve in real life.
Because the presence of noise usually masks the "real" and relevant data we want to analyse, it is important to be able to filter out as much noise as possible without unintentionally removing relevant pieces of information. There is already a Jupyter Notebook explaining the process of digitally filtering electrophysiological signals, which you can find here.
In this Jupyter Notebook, however, we will focus on how to estimate the SNR of a digital signal and thereby determine the quality of the recording.
1 - Importing the required packages
# biosignalsnotebooks own package for loading and plotting the acquired data
import biosignalsnotebooks as bsnb
# Scientific packages
from numpy import array, linspace, mean, log10
from scipy.signal import butter, filtfilt
2 - Loading the sample signal data
# Load of data (using a relative path to the project folder)
data, header = bsnb.load("../../signal_samples/eeg_sample_artefacts_seg1.h5", get_header=True)
3 - Storing the sampling frequency and acquired data in variables
Since in this case we know that the original signal is an EEG recording, we can also convert its units to $\mu V$ following the method explained in the EEG Sensor - Unit Conversion notebook.
# Sampling frequency and acquired data
fs = header["sampling rate"]
# Let's get the raw signal data and the time range of the recording
channel = list(data.keys())[0]
signal_raw = data[channel]
time = linspace(0, len(signal_raw) / fs, len(signal_raw))
# Let's convert the signal's units, since we know it is an EEG signal
vcc = 3e6 # uV
gain = 40000
resolution = header['resolution'] # Resolution (number of available bits)
signal = (((array(signal_raw) / 2**resolution) - 0.5) * vcc) / gain
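The conversion above can be wrapped in a small function to make the role of each term easier to see. This is only a minimal sketch: the helper name `adc_to_microvolts` and the 16-bit resolution used in the example are assumptions made for illustration, while the supply range (3e6 uV) and gain (40000) are the values used in this notebook.

```python
import numpy as np

def adc_to_microvolts(raw, resolution, vcc_uv=3e6, gain=40000):
    """Convert raw ADC codes to microvolts (hypothetical helper)."""
    raw = np.asarray(raw, dtype=float)
    # Normalise the codes to [0, 1), centre them around zero,
    # scale by the supply range and undo the amplifier gain.
    return ((raw / 2**resolution) - 0.5) * vcc_uv / gain

# Worked example, assuming a 16-bit acquisition:
# lowest code, mid-scale code and highest code.
codes = [0, 2**15, 2**16 - 1]
converted = adc_to_microvolts(codes, resolution=16)
print(converted)
```

With these constants the mid-scale code maps to 0 uV and the full input range spans roughly -37.5 uV to +37.5 uV.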
4 - SNR estimation
This notebook's proposed approach to estimating the SNR consists of four phases:
4.1 - Signal smoothing
This step depends heavily on the characteristics of the signal we are dealing with. Different types of signals, and acquisitions of varying quality, often require different strategies to reduce the noise components effectively without losing relevant information. Likewise, the quality and accuracy of the filtering will determine the estimated SNR value.
In this tutorial we will apply a conservative EEG filter: a Butterworth bandpass filter that removes frequencies lower than 1 Hz and higher than 50 Hz, which are usually outside the range of brainwave frequencies. There is, however, a Jupyter Notebook that covers EEG filtering in more detail: Digital Filtering - EEG.
# Time window to center our processing
sample_start = 0
sample_end = 30*fs
# Baseline shift of window
signal_shift_window = array(signal[sample_start:sample_end]) - mean(array(signal[sample_start:sample_end]))
# Butterworth bandpass filter
low_cutoff_wide = 1 # Lower cutoff frequency for bandpass filter (Hz)
high_cutoff_wide = 50 # Upper cutoff frequency for bandpass filter (Hz)
# Creating a bandpass Butterworth filter
b, a = butter(2, [low_cutoff_wide, high_cutoff_wide], btype='bandpass', fs=fs)
bandpass_filtered = filtfilt(b, a, signal_shift_window) # filter with filtfilt to obtain a zero phase response
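To get a feeling for what this filter does, it can help to apply the same design to a synthetic signal whose components are known. The sketch below is illustrative only: the sampling rate, component frequencies and amplitudes are all assumed values, not taken from the EEG sample.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs_demo = 1000  # Hz, assumed sampling rate for this synthetic example
t = np.arange(0, 30, 1 / fs_demo)

clean = np.sin(2 * np.pi * 10 * t)             # 10 Hz component inside the passband
drift = 2.0 * np.sin(2 * np.pi * 0.2 * t)      # slow baseline drift (< 1 Hz)
hf_noise = 0.5 * np.sin(2 * np.pi * 120 * t)   # high-frequency interference (> 50 Hz)
noisy = clean + drift + hf_noise

# Same design as above: 2nd-order Butterworth bandpass, 1-50 Hz,
# applied forwards and backwards for a zero-phase response.
b, a = butter(2, [1, 50], btype="bandpass", fs=fs_demo)
recovered = filtfilt(b, a, noisy)

def rms(x):
    """Root mean square of an array."""
    return np.sqrt(np.mean(np.square(x)))

error_before = rms(noisy - clean)
error_after = rms(recovered - clean)
print(f"RMS error before filtering: {error_before:.3f}")
print(f"RMS error after filtering:  {error_after:.3f}")
```

The error after filtering should be much smaller than before, because both disturbances lie outside the 1-50 Hz passband while the 10 Hz component lies well inside it.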
4.2 - Subtracting the filtered signal from the original signal
This isolates the noise that was removed in the filtering process.
# Let's subtract our filtered signal from the original signal
noise_component = signal_shift_window - bandpass_filtered
The large oscillations visible in the noise component correspond to the lowest filtered frequencies (< 1 Hz). If you take a closer look at the red signal, you can see that it also contains many small high-frequency oscillations, which correspond to the filtered high frequencies (> 50 Hz).
4.3 - Calculating the signal's and noise's peak-to-peak amplitudes
ptp_signal = max(signal_shift_window) - min(signal_shift_window)
ptp_noise = max(noise_component) - min(noise_component)
4.4 - Estimating the SNR value using the peak-to-peak amplitudes
The calculation of the SNR is as follows: \begin{align} \mathrm{SNR_{dB}} = 10 \log_{10} \left ( \frac{P_{signal}}{P_{noise}} \right ) \end{align} where $P_{signal}$ stands for the power of the original signal and $P_{noise}$ for the power of the noise component. Since power is proportional to the square of the amplitude, the same quantity expressed with the peak-to-peak amplitudes $V_{pp}$ is: \begin{align} \mathrm{SNR_{dB}} = 20 \log_{10} \left ( \frac{V_{pp,signal}}{V_{pp,noise}} \right ) \end{align}
The logarithm converts the SNR ratio, which has no units, into decibels (dB). This is done for convenience, because the SNR can take very large values, and a logarithmic scale makes them much easier to manage.
# Estimation of SNR using the peak-to-peak amplitude of the signal and the noise component.
# The factor is 20 (not 10) because amplitudes, not powers, are being compared.
snr_peak_to_peak = 20*log10(ptp_signal/ptp_noise)
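As a quick sanity check on the decibel conversion, an amplitude ratio and the corresponding power ratio should give the same SNR in dB, since power scales with the square of the amplitude. The following sketch uses two hypothetical helper functions and a synthetic signal and noise pair (all assumptions for illustration) to compare a peak-to-peak estimate with a mean-square power estimate.

```python
import numpy as np

def snr_db_from_amplitudes(signal_amplitude, noise_amplitude):
    """SNR in dB from an amplitude ratio (factor 20)."""
    return 20 * np.log10(signal_amplitude / noise_amplitude)

def snr_db_from_powers(signal_power, noise_power):
    """SNR in dB from a power ratio (factor 10)."""
    return 10 * np.log10(signal_power / noise_power)

# Synthetic example: a unit sine and a sine ten times smaller acting as "noise"
t = np.linspace(0, 1, 1000, endpoint=False)
sig = np.sin(2 * np.pi * 5 * t)
noise = 0.1 * np.sin(2 * np.pi * 50 * t)

# Peak-to-peak amplitudes and mean-square powers of both components
snr_ptp = snr_db_from_amplitudes(np.ptp(sig), np.ptp(noise))
snr_power = snr_db_from_powers(np.mean(sig**2), np.mean(noise**2))
print(f"peak-to-peak estimate: {snr_ptp:.2f} dB")
print(f"power estimate:        {snr_power:.2f} dB")
```

For pure sinusoids the two estimates agree. On real recordings they can differ, because the peak-to-peak amplitude is sensitive to isolated spikes while the mean-square power averages over the whole window.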
An SNR value higher than 0 dB indicates that there is more relevant signal than noise. However, in this case, although the value is positive, it is also low, which means that the filtered noise is significantly masking the real information. Note that the SNR value will change depending on the cutoff frequencies chosen for the bandpass filter, so it is very important to analyse our data and choose the most suitable way to extract the noise. The more reliably the filtering separates noise from signal, the more accurate the estimate will be.
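The dependence on the cutoff frequencies can be illustrated by repeating the whole procedure for several passbands. The sketch below is a rough illustration under stated assumptions: `estimate_snr_db` is a hypothetical helper that mirrors the four phases of this notebook (with the factor 20 for amplitude ratios), and the synthetic signal, sampling rate and candidate bands are invented for the example.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_snr_db(x, fs, low_hz, high_hz, order=2):
    """Peak-to-peak SNR estimate (dB) for one choice of bandpass cutoffs."""
    x = np.asarray(x, dtype=float)
    x = x - np.mean(x)                      # baseline shift
    b, a = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, x)            # zero-phase bandpass filtering
    noise = x - filtered                    # what the filter removed
    return 20 * np.log10(np.ptp(x) / np.ptp(noise))

# Synthetic signal: two in-band rhythms, a slow drift and white noise
fs_demo = 500  # Hz, assumed sampling rate
rng = np.random.default_rng(0)
t = np.arange(0, 20, 1 / fs_demo)
x_demo = (np.sin(2 * np.pi * 10 * t)
          + 0.5 * np.sin(2 * np.pi * 25 * t)
          + 0.5 * np.sin(2 * np.pi * 0.2 * t)
          + 0.2 * rng.standard_normal(t.size))

bands = [(1, 50), (1, 30), (8, 13)]
snr_by_band = {band: estimate_snr_db(x_demo, fs_demo, *band) for band in bands}
for band, value in snr_by_band.items():
    print(f"{band[0]}-{band[1]} Hz: {value:.2f} dB")
```

A narrow band such as 8-13 Hz treats the 25 Hz rhythm as removed content, so the estimated SNR drops even though the recording itself has not changed.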
Furthermore, it is necessary to differentiate between filtering used to remove noise and filtering used to extract the data we want to focus on. In the case of EEG, for example, the analysis of the alpha band usually requires isolating the frequencies between 8 and 13 Hz, but that does not mean that everything outside that range is noise. There may be other real information that is not useful for our current analysis and must therefore be ignored, but not classified as noise. So it is always important to determine what is to be considered noise in the signal we are working with.
In this example, the cutoff frequencies were chosen because in EEG, frequencies below 1 Hz and above 50 Hz are prone to contain more noise than useful data (with exceptions, of course).
The acceptable SNR value usually depends on the type of signal and on how it is going to be used but, as a general rule of thumb, the higher it is, the better the quality of the signal.
We hope you have enjoyed this tutorial!
You can keep learning thanks to the other available Jupyter Notebooks that explain how to process, analyse and extract features from electrophysiological signals.
Check the list of all available notebooks here.